[ThinLTO][NFC] Refactor FileCache #110463

kyulee-com · 2024-09-30T08:04:35Z

This is preparation for #90933.

Change FileCache from a std::function alias to a struct.
Store the cache directory in the struct so that the additional caches created for two-codegen-round runs can inherit it.

- Turn it into a type from a function.
- Store the cache directory for future use.
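As a rough illustration of the change described above, the sketch below wraps a callable together with the directory it was created for. It is not the LLVM code: `MiniFileCache`, `CacheFn`, `makeCache`, and `makeSiblingCache` are hypothetical names, plain `std::string` stands in for `StringRef`, `Twine`, and `Expected<AddStreamFn>`, and the way a second cache derives its directory is only an assumption about how the follow-up patch might use the stored path.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Simplified stand-in for FileCacheFunction. The real callback takes
// (unsigned Task, StringRef Key, const Twine &ModuleName) and returns
// Expected<AddStreamFn>; here it simply returns an entry path.
using CacheFn =
    std::function<std::string(unsigned Task, const std::string &Key)>;

// Hypothetical miniature of the new FileCache: a callable plus the
// directory it was created for.
class MiniFileCache {
public:
  MiniFileCache() = default;
  MiniFileCache(CacheFn Fn, std::string Dir)
      : Function(std::move(Fn)), DirectoryPath(std::move(Dir)) {}

  std::string operator()(unsigned Task, const std::string &Key) const {
    assert(isValid() && "Invalid cache function");
    return Function(Task, Key);
  }
  const std::string &getCacheDirectoryPath() const { return DirectoryPath; }
  bool isValid() const { return static_cast<bool>(Function); }

private:
  CacheFn Function;          // Empty when default-constructed.
  std::string DirectoryPath; // Remembered so later caches can reuse it.
};

// Hypothetical factory playing the role of llvm::localCache().
MiniFileCache makeCache(const std::string &Dir) {
  auto Func = [Dir](unsigned Task, const std::string &Key) {
    return Dir + "/" + Key + "." + std::to_string(Task);
  };
  return MiniFileCache(Func, Dir);
}

// Assumed usage: build another cache underneath the first one's directory.
MiniFileCache makeSiblingCache(const MiniFileCache &Base,
                               const std::string &SubDir) {
  return makeCache(Base.getCacheDirectoryPath() + "/" + SubDir);
}
```

With a bare `std::function` the directory is captured inside the lambda and cannot be read back; storing it next to the callable is what makes a helper like `makeSiblingCache` possible.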

llvmbot · 2024-09-30T14:17:16Z

@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-lto

Author: Kyungwoo Lee (kyulee-com)

Changes

This is a prep for #90933.

Change FileCache from a function to a type.
Store the cache directory in the type, which will be used when creating additional caches for two-codegen round runs that inherit this value.

Full diff: https://github.com/llvm/llvm-project/pull/110463.diff

4 Files Affected:

(modified) llvm/include/llvm/LTO/LTO.h (+1-1)
(modified) llvm/include/llvm/Support/Caching.h (+21-1)
(modified) llvm/lib/LTO/LTO.cpp (+1-1)
(modified) llvm/lib/Support/Caching.cpp (+3-2)

diff --git a/llvm/include/llvm/LTO/LTO.h b/llvm/include/llvm/LTO/LTO.h
index 214aa4e1c562dc..a281c377f2601d 100644
--- a/llvm/include/llvm/LTO/LTO.h
+++ b/llvm/include/llvm/LTO/LTO.h
@@ -298,7 +298,7 @@ class LTO {
   ///
   /// The client will receive at most one callback (via either AddStream or
   /// Cache) for each task identifier.
-  Error run(AddStreamFn AddStream, FileCache Cache = nullptr);
+  Error run(AddStreamFn AddStream, FileCache Cache = {});
 
   /// Static method that returns a list of libcall symbols that can be generated
   /// by LTO but might not be visible from bitcode symbol table.
diff --git a/llvm/include/llvm/Support/Caching.h b/llvm/include/llvm/Support/Caching.h
index 4fa57cc92e51f7..cc86d1583fd6e6 100644
--- a/llvm/include/llvm/Support/Caching.h
+++ b/llvm/include/llvm/Support/Caching.h
@@ -54,9 +54,29 @@ using AddStreamFn = std::function<Expected<std::unique_ptr<CachedFileStream>>(
 ///
 /// if (AddStreamFn AddStream = Cache(Task, Key, ModuleName))
 ///   ProduceContent(AddStream);
-using FileCache = std::function<Expected<AddStreamFn>(
+using FileCacheFunction = std::function<Expected<AddStreamFn>(
     unsigned Task, StringRef Key, const Twine &ModuleName)>;
 
+struct FileCache {
+  FileCache(FileCacheFunction CacheFn, const std::string &DirectoryPath)
+      : CacheFunction(std::move(CacheFn)), CacheDirectoryPath(DirectoryPath) {}
+  FileCache() = default;
+
+  Expected<AddStreamFn> operator()(unsigned Task, StringRef Key,
+                                   const Twine &ModuleName) {
+    assert(isValid() && "Invalid cache function");
+    return CacheFunction(Task, Key, ModuleName);
+  }
+  const std::string &getCacheDirectoryPath() const {
+    return CacheDirectoryPath;
+  }
+  bool isValid() const { return static_cast<bool>(CacheFunction); }
+
+private:
+  FileCacheFunction CacheFunction = nullptr;
+  std::string CacheDirectoryPath;
+};
+
 /// This type defines the callback to add a pre-existing file (e.g. in a cache).
 ///
 /// Buffer callbacks must be thread safe.
diff --git a/llvm/lib/LTO/LTO.cpp b/llvm/lib/LTO/LTO.cpp
index a88124dacfaefd..be49b447f7dcf8 100644
--- a/llvm/lib/LTO/LTO.cpp
+++ b/llvm/lib/LTO/LTO.cpp
@@ -1483,7 +1483,7 @@ class InProcessThinBackend : public ThinBackendProc {
         return E;
     }
 
-    if (!Cache || !CombinedIndex.modulePaths().count(ModuleID) ||
+    if (!Cache.isValid() || !CombinedIndex.modulePaths().count(ModuleID) ||
         all_of(CombinedIndex.getModuleHash(ModuleID),
                [](uint32_t V) { return V == 0; }))
       // Cache disabled or no entry for this module in the combined index or
diff --git a/llvm/lib/Support/Caching.cpp b/llvm/lib/Support/Caching.cpp
index 1ef51db218e89c..66e540efaca972 100644
--- a/llvm/lib/Support/Caching.cpp
+++ b/llvm/lib/Support/Caching.cpp
@@ -37,8 +37,8 @@ Expected<FileCache> llvm::localCache(const Twine &CacheNameRef,
   TempFilePrefixRef.toVector(TempFilePrefix);
   CacheDirectoryPathRef.toVector(CacheDirectoryPath);
 
-  return [=](unsigned Task, StringRef Key,
-             const Twine &ModuleName) -> Expected<AddStreamFn> {
+  auto Func = [=](unsigned Task, StringRef Key,
+                  const Twine &ModuleName) -> Expected<AddStreamFn> {
     // This choice of file name allows the cache to be pruned (see pruneCache()
     // in include/llvm/Support/CachePruning.h).
     SmallString<64> EntryPath;
@@ -167,4 +167,5 @@ Expected<FileCache> llvm::localCache(const Twine &CacheNameRef,
           Task);
     };
   };
+  return FileCache(Func, CacheDirectoryPathRef.str());
 }
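The two call-site changes in the diff (`= nullptr` becoming `= {}`, and `!Cache` becoming `!Cache.isValid()`) follow from the struct having no conversion from `nullptr` and no `operator bool`. A minimal sketch of that behavior, assuming a simplified cache whose callback only reports hit or miss; `MiniCache` and `runTask` are hypothetical names, not LLVM APIs:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for the new FileCache. Only the validity check
// matters here, so the callback just reports whether a key is cached.
struct MiniCache {
  MiniCache() = default;
  explicit MiniCache(std::function<bool(const std::string &)> Fn)
      : Lookup(std::move(Fn)) {}

  bool isValid() const { return static_cast<bool>(Lookup); }
  bool operator()(const std::string &Key) const {
    assert(isValid() && "Invalid cache function");
    return Lookup(Key);
  }

private:
  std::function<bool(const std::string &)> Lookup;
};

// Mirrors the shape of LTO::run(AddStream, Cache = {}): omitting the
// argument yields a default-constructed, invalid cache.
std::string runTask(const std::string &Key, MiniCache Cache = {}) {
  if (!Cache.isValid())
    return "codegen"; // Cache disabled: run the backend directly.
  return Cache(Key) ? "cache-hit" : "codegen+store";
}
```

Calling `runTask("k")` without a cache takes the direct path, which matches the "Cache disabled" branch guarded by `!Cache.isValid()` in `InProcessThinBackend`.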

ellishg · 2024-09-30T16:47:27Z

llvm/include/llvm/Support/Caching.h

    unsigned Task, StringRef Key, const Twine &ModuleName)>;

+struct FileCache {


Should we move the docs for FileCacheFunction to FileCache, since I think that will be the primary struct we should use? Also, we should add brief docs for FileCacheFunction to explain the arguments.

This feature is enabled by `-codegen-data-thinlto-two-rounds`, which effectively runs the `-codegen-data-generate` and `-codegen-data-use` in two rounds to enable global outlining with ThinLTO.

1. The first round: Run both optimization + codegen with a scratch output. Before running codegen, we serialize the optimized bitcode modules to a temporary path.
2. From the scratch object files, we merge them into the codegen data.
3. The second round: Read the optimized bitcode modules and start the codegen only this time. Using the codegen data, the machine outliner effectively performs the global outlining.

Depends on #90934, #110461 and #110463.

This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
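The three steps described above can be pictured with a toy model. Everything in this sketch is hypothetical: strings stand in for bitcode modules, object files, and codegen data, and `optimize`, `codegen`, and `twoRounds` are illustrative names rather than functions from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: a module is identified by its name only.
struct Module {
  std::string Name;
};

// Stand-in for the optimization pipeline.
std::string optimize(const Module &M) { return "opt(" + M.Name + ")"; }

// Stand-in for codegen; CGData is empty in the first round.
std::string codegen(const std::string &OptBitcode, const std::string &CGData) {
  std::string Obj = "obj[" + OptBitcode;
  if (!CGData.empty())
    Obj += "," + CGData;
  return Obj + "]";
}

std::vector<std::string> twoRounds(const std::vector<Module> &Mods) {
  // Round 1: optimize, keep the optimized bitcode, and emit scratch objects.
  std::vector<std::string> SavedBitcode, Scratch;
  for (const Module &M : Mods) {
    SavedBitcode.push_back(optimize(M));
    Scratch.push_back(codegen(SavedBitcode.back(), ""));
  }

  // Merge step: derive the codegen data from the scratch objects.
  std::string CGData = "cgdata(" + std::to_string(Scratch.size()) + ")";

  // Round 2: codegen only, starting from the saved bitcode and using the
  // merged codegen data.
  std::vector<std::string> Final;
  for (const std::string &BC : SavedBitcode)
    Final.push_back(codegen(BC, CGData));
  return Final;
}
```

The point of the model is that round 2 never calls `optimize` again: it reuses the bitcode saved in round 1, and the scratch objects exist only to produce the merged data.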

[NFC] Refactor FileCache

925fc1b

- Turn it into a type from a function.
- Store the cache directory for future use.

kyulee-com requested review from jvoung, teresajohnson, mingmingl-llvm, NuriAmari and ellishg September 30, 2024 14:16

kyulee-com marked this pull request as ready for review September 30, 2024 14:16

llvmbot added the "LTO" (Link time optimization (regular/full LTO or ThinLTO)) and "llvm:support" labels Sep 30, 2024

kyulee-com mentioned this pull request Sep 30, 2024

[CGData][ThinLTO] Global Outlining with Two-CodeGen Rounds #90933

Merged

ellishg reviewed Sep 30, 2024

Address comments from ellishg

85d94e3

teresajohnson approved these changes Oct 4, 2024

kyulee-com merged commit ed59d57 into llvm:main Oct 4, 2024
8 checks passed
